Support gqa in aten spda #2408
Conversation
Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>
Codecov Report ❌ Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2408      +/-   ##
==========================================
- Coverage   70.38%   70.24%   -0.14%
==========================================
  Files         199      199
  Lines       25223    25270      +47
  Branches     2686     2693       +7
==========================================
- Hits        17753    17751       -2
- Misses       6541     6586      +45
- Partials      929      933       +4

☔ View full report in Codecov by Sentry.
Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>
return _aten_scaled_dot_product_attention_bool_mask_onnx(
    query, key, value, attn_mask, scale, dropout_p, enable_gqa=enable_gqa
)
Check failure
Code scanning / CodeQL
Wrong name for an argument in a call
Copilot Autofix (AI, 3 months ago)
To fix the issue, the keyword argument enable_gqa should be removed from the call to _aten_scaled_dot_product_attention_bool_mask_onnx on line 1994. This ensures that the function is called with only the parameters it supports. Removing enable_gqa will not affect the functionality of _aten_scaled_dot_product_attention_bool_mask_onnx, as it does not use this argument.
@@ -1994,3 +1994,3 @@
 return _aten_scaled_dot_product_attention_bool_mask_onnx(
-    query, key, value, attn_mask, scale, dropout_p, enable_gqa=enable_gqa
+    query, key, value, attn_mask, scale, dropout_p
 )
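For illustration only, here is a minimal sketch of the pattern CodeQL flags as "Wrong name for an argument in a call" (hypothetical names, not the torchlib code): Python rejects a keyword argument the callee does not declare, so the fix is either to drop the argument, as the autofix does, or to add enable_gqa to the helper's signature.

def _bool_mask_helper(query, key, value, attn_mask, scale, dropout_p):
    # Stand-in for _aten_scaled_dot_product_attention_bool_mask_onnx; body elided.
    return None

try:
    # The callee declares no enable_gqa parameter, so the keyword is rejected.
    _bool_mask_helper(None, None, None, None, 1.0, 0.0, enable_gqa=True)
except TypeError as e:
    print(e)  # ... got an unexpected keyword argument 'enable_gqa'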
    axis=0
)
value_unsqueezed = op.Unsqueeze(value, [-2])
value_tiled = op.Tile(value_unsqueezed, op.Concat(
op.Tile does not align with the PyTorch implementation.
if (
(q_num_heads != k_num_heads)
and (q_num_heads % k_num_heads == 0)
and (k_num_heads == v_num_heads)
):
seq_reps = q_num_heads // k_num_heads
# Interleave-repeat each KV head: [h0, h0, h1, h1, ...]
K = np.repeat(K, repeats=seq_reps, axis=1)
V = np.repeat(V, repeats=seq_reps, axis=1)
We should be able to reuse repeat_interleave here when it's done.
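To illustrate the ordering concern (a plain numpy sketch, not the torchlib code): tiling cycles through the KV heads, while PyTorch's GQA expansion repeats each head in place, so the two pair query heads with different KV heads.

import numpy as np

k = np.arange(4).reshape(1, 2, 1, 2)  # (batch, kv_heads=2, seq=1, head_dim=2)
reps = 2                              # q_num_heads // k_num_heads

interleaved = np.repeat(k, reps, axis=1)   # head order: [h0, h0, h1, h1]
cycled = np.tile(k, (1, reps, 1, 1))       # head order: [h0, h1, h0, h1]

print(interleaved[0, :, 0, 0])  # [0 0 2 2]
print(cycled[0, :, 0, 0])       # [0 2 0 2]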
Should we use Expand for the repeat_interleave instead of Tile, for simplicity?
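For reference, a sketch of the Expand-based interleave in plain numpy (assumed shapes, not the torchlib source): insert a new axis after the head axis, broadcast it to the group size, then merge it back into the head axis, which matches np.repeat along the head axis.

import numpy as np

def repeat_interleave_heads(x, reps):
    # Unsqueeze -> Expand -> Reshape: the broadcast-based equivalent of
    # repeating each KV head `reps` times in place.
    b, h, s, d = x.shape
    expanded = np.broadcast_to(x[:, :, None, :, :], (b, h, reps, s, d))
    return expanded.reshape(b, h * reps, s, d)

k = np.arange(4).reshape(1, 2, 1, 2)
assert np.array_equal(repeat_interleave_heads(k, 2), np.repeat(k, 2, axis=1))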
I wonder if we can just adapt whatever function body is in defs.cc to torchlib? Is there any difference?
Probably not. I must have been using the old implementation.
Fix pytorch/pytorch#151762